GENERALIZED MAXIMUM LIKELIHOOD METHODS WITH
EXACT JUSTIFICATIONS ON TWO LEVELS


1. Introduction and summary. This paper is an expository
account of recent extensions of the theory of estimation [1] and of
the foundations of statistical inference [2]. This work exhibits
in different ways, and on different theoretical levels, the central
position of the likelihood function as the objective basis for
efficient statistical inference, as well as giving new practical
techniques of statistical inference.

2. Likelihood methods with objective justifications. We consider
first the familiar problem of estimation of a real-valued
parameter θ, from an outcome x of a specified statistical experiment
E (which may be sequential), represented by probability density
functions f(x,θ) (with respect to a fixed measure on a specified
sample space S = {x}), θ in some interval Ω. A simple broad basis
for appraising any estimator θ* = θ*(x) is given by the various
probabilities of its errors of overestimation and underestimation
by various amounts:

\[
a(u,\theta,\theta^{*}) =
\begin{cases}
\operatorname{Prob}[\theta^{*}(X) \le u \mid \theta], & \text{if } u < \theta,\\
\operatorname{Prob}[\theta^{*}(X) \ge u \mid \theta], & \text{if } u > \theta,\\
0, & \text{if } u = \theta,
\end{cases}
\]

for each θ and u. Such functions are called the risk curves of an
estimator θ*, and are simply a representation of the cumulative
distribution functions of the estimator. The general goal in the
estimation problem is to choose θ* so as to minimize simultaneously
as far as possible all the quantities a(u,θ,θ*). In non-trivial
problems, such appraisal and comparison of estimators leads not to
a simple ordering but to a partial ordering of estimators; for
example, errors of overestimation can be reduced in general only at
the cost of increasing errors of underestimation. In typical
problems, one is led to consideration of a rather large class of
admissible estimators, which includes confidence limit estimators
as well as point estimators.
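As an illustration of what a risk curve looks like, the following is a minimal sketch under assumptions that are not in the text: the experiment is taken to be n independent normal observations with known standard deviation, and the estimator θ* is the sample mean. The function name and the numerical values are hypothetical.

```python
from statistics import NormalDist


def risk_curve_sample_mean(u, theta, sigma, n):
    """Risk curve a(u, theta, theta*) when theta* is the mean of n
    independent N(theta, sigma^2) observations (an assumed example)."""
    if u == theta:
        return 0.0
    # The sample mean is N(theta, sigma^2 / n).
    dist = NormalDist(mu=theta, sigma=sigma / n ** 0.5)
    if u < theta:
        return dist.cdf(u)       # probability of underestimating by at least theta - u
    return 1.0 - dist.cdf(u)     # probability of overestimating by at least u - theta


if __name__ == "__main__":
    for u in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(u, round(risk_curve_sample_mean(u, theta=0.0, sigma=1.0, n=4), 4))
```

In this symmetric case the curve falls off equally on both sides of θ; an estimator shifted upward would lower the left branch only by raising the right one, which is the partial ordering described above.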

The simplest approach to the problem of jointly minimizing such
error-probabilities begins with the consideration of three values
θ₀, θ₁(θ₀), and θ₂(θ₀) of θ, θ₁(θ₀) ≤ θ₀ ≤ θ₂(θ₀), and the two
error-probabilities a(θ₀,θ₁,θ*) and a(θ₀,θ₂,θ*). This problem is
solved by direct application of the fundamental lemma of Neyman and
Pearson: We define generalized score statistics:

\[
S(x,\theta_1,\theta_2) = [\log f(x,\theta_2) - \log f(x,\theta_1)]/(\theta_2 - \theta_1)
\quad \text{if } \theta_1 \ne \theta_2,
\]
and
\[
S(x,\theta_1,\theta_1) = S(x,\theta_1) = \frac{\partial}{\partial\theta_1} \log f(x,\theta_1)
\quad \text{if } \theta_2 = \theta_1.
\]

Then the two error-probabilities mentioned are jointly minimized in
the usual sense by any estimator θ*(x) such that θ*(x) ≤ θ₀ if and
only if S(x,θ₁,θ₂) ≤ G(θ₀), where G(θ₀) is any constant. In
problems of moderate complexity, we can choose the functions
θ₁(θ), θ₂(θ), and G(θ) suitably to define an admissible estimator
θ*(x) as the unique root of the equation S(x,θ₁(θ),θ₂(θ)) = G(θ).
Various choices of these functions give different emphases to
various errors of estimation; the choice of G(θ) as the median of
the generalized score gives a median-unbiased admissible estimator;
taking G(θ) as the .05-quantile of the score gives an admissible
upper 95% confidence limit estimator; etc. Taking θ₁(θ) = θ₂(θ)
gives estimators which are locally-best (minimizing probabilities
of infinitesimally small errors of estimation); taking also
G(θ) ≡ 0 gives the maximum likelihood estimator, which is thus
shown to be admissible (non-asymptotically) along with many other
estimators.

In problems with suitable symmetry, such as the estimation of
the median of a double-exponential distribution, (1/√2) exp −√2|y − θ|,
or the median of a logistic distribution, (1 + exp −(y − θ))⁻¹,
taking G(θ) ≡ 0 and θ₁(θ) = θ − c, θ₂(θ) = θ + c, for any fixed
c > 0, an admissible median-unbiased estimator is defined as the
solution θ*(x) of the equation S(x, θ − c, θ + c) = 0. Such an
estimator is a functional of the logarithm of the likelihood
function f(x,θ), x = (y₁,...,yₙ) (n independent observations), which
can be described simply as the value of θ at which the difference
quotient S(x, θ − c, θ + c) of log f(x,θ) equals 0; in this sense,
with each different choice of c, we obtain in general a different
incomplete description of the form of the likelihood function.
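The estimator just described can be sketched numerically. The following is a minimal illustration, assuming a logistic location family with density exp(−z)/(1 + exp(−z))², z = y − θ; the function names, the bisection scheme, and the sample values are illustrative choices, not part of the original account.

```python
import math


def logistic_log_density(z):
    """log of exp(-z) / (1 + exp(-z))**2, written stably using symmetry in z."""
    a = abs(z)
    return -a - 2.0 * math.log1p(math.exp(-a))


def log_likelihood(ys, theta):
    return sum(logistic_log_density(y - theta) for y in ys)


def generalized_score(ys, theta1, theta2):
    """Difference quotient S(x, theta1, theta2) of the log likelihood."""
    return (log_likelihood(ys, theta2) - log_likelihood(ys, theta1)) / (theta2 - theta1)


def symmetric_score_estimate(ys, c, iterations=200):
    """Root in theta of S(x, theta - c, theta + c) = 0, found by bisection.

    The log likelihood is concave with its maximum between min(ys) and
    max(ys), so the score is positive at the lower end of the bracket,
    negative at the upper end, and decreasing in between.
    """
    lo, hi = min(ys) - c, max(ys) + c
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if generalized_score(ys, mid - c, mid + c) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    sample = [0.3, 1.1, 2.0, 5.5]          # hypothetical observations
    for c in (0.25, 1.0, 3.0):
        print(c, symmetric_score_estimate(sample, c))
```

Running the sketch with several values of c shows how each choice reads off a different feature of the same log likelihood curve; as c shrinks, the root approaches the maximum likelihood estimate.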

While this approach, and related generalizations [1], solves
some theoretical problems of estimation and simplifies others, and
supplies some new practical techniques of estimation having exact
objective justifications, it also leads to difficult problems of
choice of an estimator from a large admissible class, and illustrates
anew the practical importance of basic problems in the foundations
of statistical inference.

3. Likelihood methods with intrinsic justifications. In the
modern theory of probability and its standard applications in the
empirical sciences, the concept of an experiment occupies a central
and basic position. The term probability experiment is useful to
denote any completely specified mathematical probability model,
consisting of a specified sample space S = {x} of possible outcomes
x, a suitable family of subsets A of S, and a probability function
P(A), defined on those sets, which satisfies certain axioms. The
term statistical experiment is useful to denote a specified set of
two or more probability experiments, having the same sample space
and family of subsets but in general different probability functions;
each such probability experiment may be labeled by a parameter point
θ, and the parameter space Ω = {θ} is the set of such labels;
in this context, each such probability experiment represents a
(simple) statistical hypothesis H_θ. The problems of statistical
inference treated by mathematical statistics are formulated on the
basis of specified statistical experiments. (This includes problems
of experimental design, which concern the appraisal and comparison
of alternative statistical experiments.)

The character of objectivity which is basic to modern
probability theory has two sources: Its mathematical structure,
based on unequivocal and consistent formal definitions of its terms,
may be called mathematically objective, in contrast with some earlier
versions of probability theory. When a probability experiment is
used in relation to a physical phenomenon, with its mathematical
elements linked (directly or indirectly) to physical entities or
events which are observable or verifiable (in fact or in principle),
through unequivocal and consistent coordinating definitions, the
resulting interpreted mathematical model may be called physically
objective.

The character of objectivity which is a basic feature and goal
of modern mathematical statistics is based first of all on these
features of modern probability theory, and their interpretations in
the contexts of statistical experiments and the situations where
these are applied. Discussions of statistical inference problems
which do not have specified statistical experiments as their frames
of reference are usually considered unsatisfactory, and lacking in
objectivity. Nevertheless there is continuing dissatisfaction and
disagreement concerning the foundations of mathematical statistics
as a theory of statistical inference. We shall illustrate here,
first by discussion of a simple concrete example of a statistical
experiment, and then in terms of a general mathematical theorem
which the example illustrates, that for one important category of
inference problems, the concept of a statistical experiment, with
its probability terms interpreted in the usual objective ways, is
lacking in objectivity in a relevant sense which can be
demonstrated mathematically (and physically); and that mathematical
analysis leads to a different basis which is more objective and
satisfactory for such problems of statistical inference.

To simplify all but the central issues here, we consider
binary statistical experiments (those in which just two simple
hypotheses, H₁ or H₂, appear). A simple binary experiment is one
in which the sample space S contains only two points; except in the
trivial case that the hypotheses are equivalent, one point, to be
called "positive", has larger probability under H₂ than under H₁;
the other point will be called "negative". Each simple binary
experiment is represented by a pair (α,β), where α is the probability
of a "false positive" (or a Type I error), and β is the probability
under H₂ of "negative", that is, the probability of a "false
negative" (or a Type II error). For any α, (α,1−α) represents a trivial
experiment in which H₁ and H₂ are equivalent. For applications
such as the detection of presence or absence of some physical or
biological condition in a person or material under investigation,
a single application of any technique of measurement or
observation which gives dichotomous outcomes is represented
mathematically by a simple binary experiment (α,β). If such a
technique is applied, with statistical independence, n times, the
experiment is binary but no longer simple; its mathematical model
is given by the binomial distributions:

\[
\operatorname{Prob}(x \mid H_1) = \binom{n}{x}\alpha^{x}(1-\alpha)^{n-x}, \qquad
\operatorname{Prob}(x \mid H_2) = \binom{n}{x}(1-\beta)^{x}\beta^{n-x}, \qquad
x = 0, 1, \ldots, n.
\]

We denote any such binary experiment by the symbol (α,β)ⁿ.

A symmetric simple binary experiment is one of the form (α,α).
If various experiments of this form (without replication) are
possible in a given application, these admit a simple ordering:
(α,α) is more informative than (α′,α′) if 0 ≤ α < α′ ≤ ½.
Correspondingly, an outcome from (½,½) is uninformative and irrelevant to
the hypotheses. For 0 < α < ½, an outcome from (α,α) is incompletely
informative. And if 0 ≤ α < α′ ≤ ½, then an outcome from (α,α) is
more informative than one from (α′,α′). These terms concerning the
value or strength of a specified outcome of a binary experiment,
as evidence relevant to the hypotheses H₁ or H₂, are objectively
defined, mathematically and physically, in the same sense as are the
terms of modern probability theory referred to above; in fact, their
objective character may be viewed as based directly, by
definition, on the mathematically and physically objective
characters of the symmetrical simple binary statistical experiments (α,α),
0 ≤ α ≤ ½.

It is convenient to employ the following (sufficient)
statistic, defined on the sample space S = {x} of any binary
experiment represented by any elementary probability functions f₁(x),
f₂(x) respectively representing H₁ and H₂:

\[
r = r(x) = \log\, f_2(x)/f_1(x).
\]

Then in the case of any symmetrical simple binary experiment (α,α),
the outcome "negative" gives r = −log (1−α)/α and the outcome
"positive" gives r = log (1−α)/α. For such experiments, the
algebraic sign of r(x) represents a qualitative property of any
outcome, as favoring either H₁ or H₂; while the absolute value
|r(x)| represents on a convenient scale, from 0 to ∞, its
strength as evidence relevant to H₁ or H₂, with the value ∞
representing a completely informative outcome, and the value 0
representing a (completely) uninformative outcome. The interpretation
of r(x) in other types of binary experiments remains to be discussed.
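The statistic r can be computed directly for any simple binary experiment (α,β). The sketch below is illustrative only: the function name is hypothetical, natural logarithms are assumed, and both α and β are assumed to lie strictly between 0 and 1 so that no logarithm is infinite.

```python
import math


def r_values(alpha, beta):
    """Return (r_positive, r_negative) = log f2/f1 for the two outcomes
    of the simple binary experiment (alpha, beta), with 0 < alpha, beta < 1."""
    # "positive" has probability alpha under H1 and 1 - beta under H2.
    r_positive = math.log((1.0 - beta) / alpha)
    # "negative" has probability 1 - alpha under H1 and beta under H2.
    r_negative = math.log(beta / (1.0 - alpha))
    return r_positive, r_negative


if __name__ == "__main__":
    for alpha in (0.5, 0.2, 0.05, 0.001):
        print(alpha, r_values(alpha, alpha))
```

For a symmetric experiment the two values differ only in sign, and their common absolute value grows without bound as α approaches 0.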

Example. Consider the "mixture" experiment E* defined as
follows: With respective probabilities g₀ = .1536, g₁ = .2544,
g₂ = .5920, select at random one of the experiments (.5,.5),
(.0588,.0588), or (.0039,.0039); and obtain a single outcome
("positive" or "negative") by use of the selected experiment. The
discussion above shows how any outcome of E* should be interpreted,
for the purposes of inference being considered here; such
interpretation depends only on the selected simple experiment and its
outcome, and is otherwise independent of the mathematical structure
of E*; it is easily verified that the sufficient statistic r,
defined as above, of the mixture experiment E* automatically takes
the same numerical values as does the corresponding statistic r
defined on any selected simple experiment.

Consider alternatively the binomial experiment E: (.2,.2)⁴,
with possible outcomes x = 0,1,...,4. Consider the problem of
interpreting corresponding values of the sufficient statistic r(x),
as evidence relevant to H₁ or H₂. Since E is not a symmetrical
simple binary experiment, the above discussion has not been shown
to be relevant to the interpretation of numerical values of r(x).
However, it is a mathematical fact, easily verified, that E is
equivalent to E* in the sense that the sufficient statistics r of
the two experiments have the same distributions, under H₁ and H₂
respectively. (The rational numbers required to define E* have
been given here only to four-decimal accuracy.) Under this
equivalence, outcomes of E and of E* are equivalent if and only
if they give the same values to the sufficient statistics r. It
follows that the outcome r(x) of E should be interpreted as if the
same numerical value r had been obtained from a symmetrical simple
binary experiment. The scope of interpretations of values r, found
above, is extended in this way to the present experiment E; and
in this sense, the mathematical structure of E as a whole becomes
irrelevant to the interpretation of outcomes, once the value of
r(x) is given. It may fairly be said that the frame of reference
of these interpretations continues to be the mathematical model of
some experiment, namely the particular simple experiment chosen in
the mixture experiment E* which mathematical analysis shows to be
equivalent to the binomial experiment E.
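The correspondence between values of r in the binomial experiment and symmetric simple experiments can be checked numerically. The sketch below assumes the symmetric binomial experiment (.2,.2) replicated n = 4 times, matching the outcomes x = 0,...,4 above, and natural logarithms; the function names are hypothetical. It recovers only the values of r and the matching symmetric experiments, not the mixing probabilities.

```python
import math


def binomial_r(x, n, alpha):
    """r(x) = log f2(x)/f1(x) for the symmetric binomial experiment
    (alpha, alpha) replicated n times; the binomial coefficients cancel."""
    return (2 * x - n) * math.log((1.0 - alpha) / alpha)


def matching_symmetric_alpha(r):
    """The alpha <= 1/2 whose symmetric simple experiment (alpha, alpha)
    has outcomes with absolute r-value equal to abs(r)."""
    return 1.0 / (1.0 + math.exp(abs(r)))


if __name__ == "__main__":
    for x in range(5):
        r = binomial_r(x, n=4, alpha=0.2)
        print(x, round(r, 4), round(matching_symmetric_alpha(r), 4))
```

The outcomes x = 1 and x = 3 match the symmetric experiment with α near .0588, x = 0 and x = 4 match α near .0039, and x = 2 matches the uninformative experiment (.5,.5).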

Consider alternatively a different mixture experiment E**
defined as follows: With respective probabilities g₀ = .1536,
g₁ = .4232, g₂ = .4232, select at random one of the experiments
(.5,.5), (.0037,.0623), or (.0623,.0037); and obtain a single
outcome by use of the selected experiment. For our inference
purposes, any outcome of this experiment should again be interpreted
with the selected simple experiment as the frame of reference, and
for these purposes the form of E** as a whole is otherwise
irrelevant. However, it is easily verified that E** is
mathematically equivalent to E* (in the sense defined above), and that
each outcome of E** is equivalent to a certain outcome of E*. In
particular, the outcome "positive" from (.0037,.0623) in E** is
equivalent under this correspondence to the outcome "positive" from
(.0039,.0039) in E*, and the outcome "positive" from (.0623,.0037)
in E** is equivalent to the outcome "positive" from (.0588,.0588)
in E*.

Thus we have found that, as a frame of reference for
interpreting outcomes, "the selected simple experiment" in any mixture
experiment is clearly more relevant and objective than the
structure of the mixture experiment as a whole; and yet the
objectivity of "the selected simple experiment" is in a sense
illusory, since in different but equivalent mathematical models we
find different simple experiments serving equally well as "objective"
frames of reference for interpretations of the same outcome. What
is in fact both objective and essentially relevant for such
interpretations is only the numerical value of the sufficient statistic
r on the observed outcome, with its objective interpretations as
given above.

The generality of the features illustrated in this example is
established in the

Theorem. Each binary experiment is equivalent to a mixture
of simple binary experiments. (Most binary experiments, including
the binomial example above, can be represented in an infinite
number of different forms as mixtures of simple experiments.)

Such analysis leads to the following conclusion: For problems
of statistical inference of the kind described above, given the
numerical values of the likelihood function determined on the
observed outcome of any specified binary experiment (that is, given
f₁(x) and f₂(x) for the observed x, or, more concisely, given
r(x) = log f₂(x)/f₁(x)), the structure of the experiment as a
whole is irrelevant.

One result of this analysis is that a long-standing point of
difference between Bayesian and non-Bayesian statisticians can be
in part resolved as follows: for problems of the kind considered
here, Bayesian statisticians can agree with non-Bayesians who follow
the above analysis that r-values express in an objective sense the
relevant evidence from the experimental outcome itself; the
remaining questions concern only the various possible modes of
interpretation of r-values in various inference situations.

The structure of any experiment is crucial for many other
kinds of problems of inference or decision-making dealt with by
mathematical statistics. And even for the kind of inference
problem considered here, the structure of an experiment is crucial
in the sense that it represents the design of an experiment: even
if the interpretation of outcomes will leave aside the structure of
an experiment, there remain the crucial problems of appraising,
comparing, and choosing experimental designs for use in this way.
A highly informative experiment is one which gives with high
probability highly informative outcomes (large values of |r|,
under each hypothesis). It is not clear that a numerical measure
of informativeness of an experiment in this sense is necessary or
that it could be fully adequate, since the distributions
of r under respective hypotheses are basic and directly
interpretable.
1. Introduction. There is increasing awareness among applied
and theoretical statisticians that many problems customarily
formulated in terms of testing statistical hypotheses can be
formulated more appropriately as problems of estimation. The recent
expository paper by Natrella [1] describes this trend and some
principal reasons for it, and illustrates how the close
relationship between confidence intervals and tests facilitates a smooth
shift of emphasis from the techniques and concepts of testing to
those of estimation. The purpose of the present note is to describe
a technique of estimation by confidence curves, which more formally
incorporates the practical techniques of testing, along with those
of point estimation and estimation by confidence limits and
confidence intervals at various levels. In one-parameter problems,
a confidence curve estimate can be interpreted flexibly, in any
context of application for general-purpose informative inferences,
so as to provide conveniently any number of valid inferences of
the following forms: (a) confidence limits and confidence
intervals, at various confidence levels, and a point estimate;
(b) significance tests, one- or two-sided, of particular parameter
values representing any hypothesis of interest; and for the latter
tests, (c) the critical level of Type I (that is, the customary
"P-level", the significance level at which the observed data would
indicate rejection); and also (d) at each parameter value
representing an alternative hypothesis of interest, a critical level of
Type II (that is, the analogue of the customary "P-level" which
corresponds to errors of Type II), which represents the power of
the test in a form which can be interpreted conveniently as part
of the over-all interpretation of observed data.

2. Definition of a confidence curve estimate, and an example.
For typical problems in which one parameter is of primary
interest, a confidence curve estimate is defined simply as a set
of confidence limits at various confidence levels.
It is convenient to use the notation t for the observed value
of the appropriate basic statistic in any specified experiment.
For example, if n independent observations yᵢ are obtained, and if
the sample mean is the appropriate statistic, then
t = ȳ = Σ yᵢ/n (summed over i = 1, ..., n). Let θ denote the unknown
value of the parameter of interest. Let γ denote any fixed number,
0 ≤ γ ≤ 1. For each γ > .5, let θ(t,γ) denote a lower confidence
limit for θ, at the γ confidence level, based on the observed value t.
For each γ < .5, let θ(t,γ) denote an upper confidence limit for θ,
at the (1−γ) confidence level, based on t. For γ = .5, the corresponding
mathematical definition of θ(t,γ) = θ(t,.5) can be interpreted more
usefully as follows: θ(t,.5) is a point-estimator of θ which is
median-unbiased. To avoid ambiguity, it is convenient to replace
the usual term "unbiased" by mean-unbiased, to refer to the property
that an estimator's mean value is the true parameter value being
estimated. A median-unbiased estimator is one whose median is the
true value being estimated; that is, a median-unbiased estimator
has probabilities of overestimation and of underestimation each
equal to ½.
All of these definitions are summed up by stating: For each
γ, whatever may be the true value θ of the parameter, the estimator
θ(t,γ) has the basic property that its value is less than θ with
probability equal to γ (and hence its value exceeds θ with
probability (1−γ); we leave aside the minor technicalities of
cases where estimators have discontinuous distributions). In
typical problems, the usual definitions of confidence limits
provide the following additional property: For each possible
observed t, as γ decreases from 1 to 0, the respective values of
the estimates θ(t,γ) increase continuously through the range of
possible values of θ.

The manner of computing and reporting such sets of estimates
will naturally vary with problems and purposes. One form which is
often convenient, and for which the term confidence curve seems
particularly appropriate, may be defined as follows for typical
problems: If a standard confidence limit method is applied to a
given observed value t, each of the possible values of θ will be
a lower confidence limit at some confidence level γ and also an
upper confidence limit at the corresponding level (1−γ); for
each θ, let c(θ,t) denote the smaller of these two values, γ or
(1−γ). Then, for any observed value t, as θ increases through its
range, the confidence curve c(θ,t) will increase continuously from
0 to ½, and then decrease continuously to 0. An alternative
definition of the confidence curve c(θ,t) is the following: given
the observed t, for each θ the value of c(θ,t) is the smaller of
the critical levels of the two one-sided tests of the hypothesis
that the true value of the parameter is θ, one against larger
alternatives and the other against smaller alternatives.
Figure 1a. Confidence curve estimate of a binomial proportion
p based on n = 75 observations and an observed
proportion p̂ = x/n = 45/75 = .6.

Figure 1b. Graphs of some percent points of p̂ = x/n.

The typical form of a confidence curve estimate is illustrated
in Figure 1a, which is the graph of such an estimate of a binomial
mean (proportion) p, based on n = 75 observations and an observed
proportion p̂ = 45/75 = .6. The general formula for a confidence
curve estimator of a binomial proportion p, given an observed
proportion p̂ = x/n based on n observations, using the usual normal
approximation, is

\[
c(p,\hat{p}) = \Phi\!\left(-\sqrt{n}\,|\hat{p}-p| \big/ \sqrt{p(1-p)}\right),
\]

where Φ denotes the standard normal cumulative distribution function.
(Here θ becomes p, t becomes p̂ = x/n, and the confidence limit θ(t,γ)
may be designated p(p̂,γ) or p(x/n,γ). In this notation, the usual
(mean-unbiased) point estimator of p is x/n ≡ p̂ = p(p̂,.5).) The
discreteness of distributions of t in such problems represents a
minor theoretical and computational complication; except with very
small sample sizes, the quantitative and theoretical significance
of such complications is minor, and it seems appropriate for
typical purposes of informative inference to use the usual
continuous approximations to the distributions involved. When
desired, such approximations can be replaced by exact probabilities
taken from tables of the binomial distribution; this is advisable
for p or p̂ near 0 or 1, and more generally when n is small. In the
present example we have

\[
c(p,.6) = \Phi\!\left(-(8.66)\,|p-.6| \big/ \sqrt{p(1-p)}\right),
\]

which may be evaluated, using tables of Φ, for as many values of p
as desired to provide a sketch of the confidence curve estimate.
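In place of tables, the normal-approximation formula can be evaluated by a short program. The sketch below is one possible implementation; the function name and the grid of p values are illustrative, and p is assumed to lie strictly between 0 and 1.

```python
from statistics import NormalDist


def binomial_confidence_curve(p, p_hat, n):
    """Normal-approximation confidence curve c(p, p_hat) for a binomial
    proportion, given observed proportion p_hat from n observations.
    Requires 0 < p < 1."""
    z = (n ** 0.5) * abs(p_hat - p) / (p * (1.0 - p)) ** 0.5
    return NormalDist().cdf(-z)


if __name__ == "__main__":
    # Sketch of the curve for the example with n = 75 and p_hat = .6.
    for p in (0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75):
        print(p, round(binomial_confidence_curve(p, 0.6, 75), 4))
```

The curve peaks at ½ where p equals the observed proportion and falls toward 0 on either side, as described for confidence curves in general.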
An alternative graphical method of construction of such a
confidence curve estimate is illustrated in Figure 1b. This
method is clearly applicable in any problem for which graphs (or
corresponding tables) of various quantiles (percent points)
of the basic statistic are available. The figure also illustrates
further the definition of a confidence curve estimate. Figure 1b
contains graphs of the functions p ± k(p(1−p))^{1/2}, for
k = .277, .196, .124, .062, and 0. (These values of k correspond
to some of the graphs contained in the charts of 95% confidence
belts of Clopper and Pearson [2], namely those labeled 50, 100,
250, 1000, and the central diagonal line.) Each of these functions
gives quantiles of x/n as a function of p (based on the usual
normal approximation). For n = 75, these are the .008, .045, .142,
.295, .5, .705, .858, .955, and .992 quantiles of p̂ = x/75. The
inverses of these functions, easily read graphically, give upper
and lower confidence limits based on any observed value; in the
present example, we obtain in this way, as indicated by arrows in
the figure, the following point estimates, as a basis for a
sketch of the complete confidence curve estimate: lower 99.2%
limit: .45; lower 95.5% limit: .50; ...; median-unbiased estimate:
.60; upper 70.5% limit: .63; ...; upper 99.2% limit: .74.
3. Interpretations of confidence curve estimates. The range
and flexibility of possible interpretations of a confidence curve
estimate can be illustrated by considering another example, shown
in Figure 2. This confidence curve is an estimate of the mean θ
of a normal distribution with known standard deviation σ = 5,
based on a sample of n = 4 independent observations yᵢ whose sample
mean is t = ȳ = 5. Thus Figure 2 is the graph of the confidence
curve estimate c(θ,t) = c(θ,5) = Φ(−(.4)|θ − 5|). (The same estimate
would arise in the following different problem: Let θ = μ₂ − μ₁ be
the unknown difference between two means μᵢ of normal distributions
with known standard deviations σ₁, σ₂; and suppose that a
difference of independent sample means ȳ₂ − ȳ₁ = 5 is observed,
based on sample sizes n₁, n₂ such that

\[
\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} = 2.5 = 1/(.4).)
\]
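The confidence curve for this normal example can be evaluated numerically. The sketch below assumes only the quantities stated above (observed mean 5, standard error 2.5); the function names are hypothetical, and `central_interval` is an illustrative helper for reading a two-sided interval off the curve.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_confidence_curve(theta, t, std_error):
    """c(theta, t) = Phi(-|theta - t| / std_error) for a normal mean."""
    return STD_NORMAL.cdf(-abs(theta - t) / std_error)


def central_interval(t, std_error, level):
    """Two-sided confidence interval at the given level: the two values
    of theta at which the curve has height (1 - level) / 2."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return t - z * std_error, t + z * std_error


if __name__ == "__main__":
    # One-sided critical level for the hypothesis theta = 0.
    print(round(normal_confidence_curve(0.0, 5.0, 2.5), 4))
    # .99 confidence interval.
    print(tuple(round(v, 2) for v in central_interval(5.0, 2.5, 0.99)))
```

The height of the curve at θ = 0 is the one-sided critical level for that hypothesis, and doubling it gives the two-sided level.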


The following are examples of the inference statements about $\theta$,
incorporated in this confidence curve estimate, which may be read
by inspection of Figure 2:

1. A point estimate of $\theta$ is 5. (Here and in many common examples
the best median-unbiased estimate obtained in this way
coincides exactly or very nearly with standard estimates based
on the criterion of mean-unbiasedness or that of maximum
likelihood, except for very small sample sizes in some
examples.)

2. An upper .90 confidence limit for $\theta$ is 8.2.


Figure 2.   Confidence curve estimate $c(\theta, t)$ of a normal mean $\theta$
based on a sample mean = 5, having a standard
error = 2.5.


3. A .99 confidence interval for $\theta$ is: -1.4 to 11.4.

4. In testing the hypothesis $H_0: \theta = 0$ against the one-sided
alternative $H_1: \theta > 0$, we would just reject $H_0$ at the .025
significance level. (Hence we would not reject $H_0$ at the .01
level, but would reject it at the .05 level.)

5. In testing $H_0: \theta = 0$ against the one-sided alternative
$H_1: \theta < 0$, we would accept $H_0$ at any of the usual significance
levels.

6. In testing $H_0: \theta = 0$ against the two-sided alternative
$H_1: \theta \neq 0$, we would just reject $H_0$ at the .05 (= 2(.025))
significance level (but at no smaller level).

7. The information conveyed in the preceding three inference
statements (4)-(6), which are in hypothesis-testing form, can
be expressed alternatively in confidence-limit statements as
follows: The value 0 is a .975-level lower confidence limit
for $\theta$. (That is, we have moderately high confidence that the
true value of $\theta$ is not as small as 0. Our confidence in this
inference is not as strong as is represented by the .99
confidence level, but is stronger than is represented by the .95
confidence level.)

8. The one-sided test of (4) above, which rejects $H_0: \theta = 0$ in
favor of larger alternatives at the .025 significance level,
has a Type II critical level of approximately .16 at the
alternative hypothesis $\theta = 7.5$ (corresponding to power = .84
against this alternative). If alternatives of this or larger
magnitudes are of interest, the observed data may be considered
fairly strong evidence against $H_0$ favoring such alternatives,
since outcomes with sample means at least as large as that
observed have relatively small probability (.025) under $H_0$
but relatively large probability (.84 or greater) under such
alternatives.

9. For the same test, at the alternative hypothesis $\theta = 1$ we have
a Type II critical level of approximately .95 (corresponding
to power = .05 only). If alternatives of about this magnitude
are of interest, the observed data cannot be considered very
helpful, since outcomes with sample means at least as large
as that observed have small probabilities of similar
magnitudes under the different hypotheses of interest
($\theta = 0$ or $\theta = 1$).
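
The numerical statements (3)-(9) can be checked directly from the normal model. The following Python sketch assumes only the values given above (sample mean 5, standard error 2.5) and the standard normal c.d.f. The function names are illustrative, and the "power" computed is that of the test whose critical value equals the observed sample mean, as in (8) and (9).

```python
from statistics import NormalDist

Z = NormalDist()      # standard normal distribution
T, SE = 5.0, 2.5      # observed sample mean and its standard error

def lower_limit(level):
    """One-sided lower confidence limit for theta at the given level."""
    return T - SE * Z.inv_cdf(level)

def upper_limit(level):
    """One-sided upper confidence limit for theta at the given level."""
    return T + SE * Z.inv_cdf(level)

def interval(level):
    """Two-sided, equal-tailed confidence interval for theta."""
    z = Z.inv_cdf(0.5 + level / 2)
    return (T - SE * z, T + SE * z)

def tail_prob(theta):
    """P(sample mean >= T | theta). At theta = theta_0 this is the one-sided
    significance level against larger alternatives; at an alternative theta_1
    it is the power of the test that rejects for sample means >= T."""
    return 1 - Z.cdf((T - theta) / SE)

def p_less(theta0):
    """One-sided significance level against smaller alternatives."""
    return Z.cdf((T - theta0) / SE)

def p_two_sided(theta0):
    """Two-sided significance level: twice the smaller one-sided level."""
    return 2 * min(tail_prob(theta0), p_less(theta0))

print("(3) .99 interval:            ", interval(0.99))      # about (-1.44, 11.44)
print("(4) level, H1: theta > 0:    ", tail_prob(0.0))      # about .023
print("(5) level, H1: theta < 0:    ", p_less(0.0))         # about .977
print("(6) level, H1: theta != 0:   ", p_two_sided(0.0))    # about .046
print("(7) .975 lower limit:        ", lower_limit(0.975))  # about 0.1
print("(8) power at theta = 7.5:    ", tail_prob(7.5))      # about .84
print("(9) power at theta = 1:      ", tail_prob(1.0))      # about .055
```

The Type II critical levels quoted in (8) and (9) are one minus the corresponding powers. The exact one-sided level at $\theta = 0$ is about .023, which the text rounds to the conventional .025.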
4.   Discussion.   Systematic use of estimators consisting of
sets of confidence limits at various levels has been proposed by
Tukey [3] and by Cox [4], for reasons generally similar to those
described above. The particular form for such estimators proposed
above, confidence curves, seems to serve conveniently for typical
general purposes of informative inference, when based on standard
current statistical techniques as illustrated above. In addition,
this form of definition of an (omnibus) estimator serves well in
some extensions and unifications of the theory and practical
techniques of estimation which will be published separately [5].
The latter include certain generalizations and justifications of
maximum likelihood methods, and unification of the latter with the
theory and techniques of confidence limit estimation.

The problem of interpreting significance test results so as
to distinguish appropriately between formal statistical significance
on the one hand, and practical significance in a specific
context of application on the other hand, is met in a simple but
helpful way by use of confidence curves, as illustrated by points
(8) and (9) of the preceding section. The relatively informal
comments given in that section to illustrate the interpretations
of an observed outcome, as evidence relevant to the various
statistical hypotheses considered, seem to give unified and explicit
form  to  much  current  practice  of  applied  statistics  based  on  the 
generally  accepted  principles  and  foundations  of  the  theory  of 
Neyman  and  Pearson.  At  the  same  time,  the  writer  feels  that  these 
foundations  of  statistical  inference  will  bear  further  discussion, 
which  will  be  offered  elsewhere  [6],  and  in  which  certain 
refinements  and  modifications  of  current  theoretical  formulations 
and  practical  techniques  will  be  proposed. 
For any specific application, the form and completeness in
which a confidence curve estimate is reported can of course vary
greatly. For many purposes a very rough sketch based on only
several computed points will suffice. And of course for many
purposes the more standard techniques, either tests or estimates
(of the point, confidence limit, or confidence interval form) may
well suffice; from a formal theoretical standpoint, it may sometimes
be useful to regard any one of these standard techniques as an
incomplete description, of an underlying complete confidence curve
estimate, which suffices for a particular application. This
standpoint is helpful in avoiding interpretations of standard
techniques which are tied too formally to chosen fixed confidence
or significance levels, interpretations which seem inappropriately
schematic in typical contexts of informative inference.